The image of a uniform wall illuminated by a spot light often gives a strong impression of the
illuminant  color.  How  can  it  be  possible  to  know  if  it  is  a  white  wall  illuminated  by  yellow  light 
or  a  yellow  wall  illuminated  by  white  light?  If  the  wall  is  a  Lambertian  reflector,  it  would  not  be 
possible  to  tell  the  difference.  However,  in  the  real  world,  some  amount  of  specular  reflection  is 
almost  always  present.  In  this  memo,  it  is  shown  that  the  computation  is  possible  in  most  practical 
cases. 

Light  reflection  from  a  surface  is  usually  modeled  as  having  two  components:  the  interface 
(specular) reflection and the body (diffuse) reflection. For a surface of inhomogeneous material, the
spectral  composition  of  the  interface  reflection  component  is  often  similar  to  that  of  the  illuminant. 
The  problem  of  computing  the  illuminant  chromaticity  from  the  shading  of  a  single  smooth  surface 
is  to  separate  these  two  components.  An  image  of  an  illuminated  uniform  wall,  according  to  the 
above  model,  gives  only  one  physical  constraint  about  the  illuminant  chromaticity,  not  enough  to 
determine  a  unique  solution.  However,  since  the  spatial  scale  over  which  the  interface  reflection 
changes significantly is much smaller than that of the body reflection, it can be shown that one can
effectively exploit the scale difference to find a unique solution, which is often very accurate. The
method can also be generalized to compute the illuminant chromaticity for a nonuniform smooth
surface.

1 Introduction


When  we  walk  through  many  rooms  inside  a  building  with  different  lightings  in  various 
locations,  we  are  always  aware  of  the  change  of  the  illuminant  color.  Although  there  are 
few  quantitative  measurements  available,  our  experiences  tell  us  that  the  human  visual 
system  seems  to  be  able  to  perceive  qualitatively  the  scene  illuminant  quite  well.  Even 
when we have difficulty judging the “true” color of a piece of fabric under certain indoor
lighting  (an  illustration  of  the  breakdown  of  color  constancy),  we  seldom  fail  to  tell  the 
color  of  the  illuminant  even  without  looking  at  it  directly.  This  effortless  perception  of 
scene  illumination  does  not  depend  on  stereo  or  motion,  as  our  experience  in  watching 
projection  slides  of  still  natural  scenes  can  tell  us.  It  does  seem  to  depend  on  the  presence 
of  gradual  shadings  on  the  object  surfaces.  It  is  well  known  that  color  patches  of  uniform 
chromaticity  and  luminance  do  not  give  the  perception  of  illumination.  This  is  called  the 
aperture  (or  film)  mode  of  color  perception.  In  order  to  give  an  impression  of  illumination, 
the  surface  shading  has  to  have  gradual  variations.  This  is  called  the  surface  mode  of  color 
perception,  that  is,  the  perceived  color  seems  to  belong  to  the  object  surface  [7]. 

Computing  the  scene  illuminant  color  from  a  given  color  image  is  not  a  simple  problem. 
The  difficulty  is  that  the  recorded  color  image  irradiances  are  functions  of  the  illuminant, 
the surface shape, and the surface reflectances. Without knowing any two of them, there
are infinitely many possible solutions (a mathematically ill-posed problem [9]). Recent work [5] [12]
suggests  that  the  specular  (interface  reflection)  component  of  surface  reflection  can  be  used 
to  compute  the  illuminant  chromaticity.  Since  this  method  for  computing  the  illuminant 
color  is  based  on  the  idea  of  finding  the  converging  point  of  the  surface  chromaticity  loci,  we 
will  call  it  the  chromaticity  convergence  method  [5],  which  is  explained  in  the  next  section. 


2  The  Reflection  Model  and  the  Chromaticity  Diagram 

The  general  light  reflection  model  of  a  uniform  surface  for  a  three  color  imaging  system 
used  in  the  chromaticity  convergence  method  is  as  follows  (see  [6]  for  details).  Assume  that 
the  incident  radiance  on  a  surface  can  be  written  as: 

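$$L(\theta_i, \phi_i, \lambda) = c(\lambda)\, L_0(\theta_i, \phi_i) \tag{1}$$

with $c(\lambda)$ normalized to one at its maximum, where $\theta_i$ and $\phi_i$ are the incident angles. Let
$R_r(\lambda)$, $R_g(\lambda)$, and $R_b(\lambda)$ be the spectral responsivity functions of the color imaging system,
and

$$\begin{aligned}
L_r &= \int c(\lambda) R_r(\lambda)\, d\lambda, \\
L_g &= \int c(\lambda) R_g(\lambda)\, d\lambda, \\
L_b &= \int c(\lambda) R_b(\lambda)\, d\lambda,
\end{aligned} \tag{2}$$

then

$$\begin{aligned}
E_r(x,y) &= k L_r \left(\rho_r f(x,y) + \rho_s h(x,y)\right), \\
E_g(x,y) &= k L_g \left(\rho_g f(x,y) + \rho_s h(x,y)\right), \\
E_b(x,y) &= k L_b \left(\rho_b f(x,y) + \rho_s h(x,y)\right),
\end{aligned} \tag{3}$$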


where x and y are the spatial coordinates on the image plane, and r, g, and b indicate
the conventional red, green, and blue channels or any other combinations of them. $E_r$,
$E_g$, and $E_b$ are the red, green, and blue image irradiances, $\rho_r$, $\rho_g$, and $\rho_b$ are the diffuse
(body) reflectance factors, $\rho_s$ represents the specular (interface) reflectance factor, and
$f(x,y)$ and $h(x,y)$ represent the factors dependent on the imaging geometry and surface
shapes. This reflection model makes a strong assumption that the reflectance factor of the
interface  (specular)  reflection  component  is  independent  of  the  imaging  geometry  as  well 
as  the  spectral  responsivity  of  each  color  channel.  We  will  call  this  reflection  model  the 
neutral  interface  reflection  (NIR)  model.  This  assumption,  of  course,  is  not  strictly  true  in 
practice  (especially  for  homogeneous  material,  as  discussed  in  detail  by  Cook  and  Torrance 
[1]), but there are experimental data showing that it is a reasonable approximation for
many types of surface material [6]. Although the NIR model is a special case of a more
general  model,  the  dichromatic  reflection  model,  described  by  Shafer  [10],  which  does  not 
assume  that  the  interface  reflection  is  neutral,  it  is  often  used  in  real  applications,  because 
of  its  good  approximation  and  simplicity. 

In  general,  one  can  not  recover  the  absolute  magnitudes  of  both  the  light  source  intensity 
and  the  reflectance  factor  at  the  same  time  from  the  image  irradiance  signal  alone,  because 
if  one  raises  the  intensity  by  a  factor  of  2  and  reduces  the  reflectance  factor  by  a  factor  of 
2,  the  image  irradiance  will  remain  the  same.  Therefore,  it  is  useful  to  define  quantities 
which  would  specify  the  color  of  the  light,  independent  of  its  intensity.  For  this  purpose,  the 
chromaticity  of  a  given  beam  of  light  irradiating  the  image  plane  is  defined  by  the  following 
ratios  [11]  (The  standard  notations  for  CIE  chromaticity  coordinates  are  x,  y,  and  z.  To 
avoid  confusion  with  the  spatial  coordinates,  we  will  use  u,  v,  and  w  in  the  text,  and  use 
x,  y,  z  in  the  figures  where  CIE  chromaticity  diagrams  are  used.): 


$$\begin{aligned}
u &= E_r / (E_r + E_g + E_b), \\
v &= E_g / (E_r + E_g + E_b), \\
w &= E_b / (E_r + E_g + E_b).
\end{aligned} \tag{4}$$


Since $u + v + w = 1$, one needs only u and v to specify the chromaticity coordinates of
the light. It is also easy to see that they are independent of the light intensity. One
useful property of the chromaticity coordinates is that if one additively mixes two lights
to produce a third light, then the chromaticity coordinates of the third light are a linear
combination of the chromaticity coordinates of the first two lights. Let $(u_1, v_1)$ and
$(u_2, v_2)$ be the chromaticity coordinates of the first two lights, and let the “total irradiance”
$E = E_r + E_g + E_b$ of the first light be $E_1$ and that of the second light be $E_2$; then it can
be easily shown that the chromaticity coordinates of the mixed light are:

$$u = \frac{E_1 u_1 + E_2 u_2}{E_1 + E_2}, \qquad v = \frac{E_1 v_1 + E_2 v_2}{E_1 + E_2}. \tag{5}$$

The important consequence of this property for the NIR model is that since the light
reflected  from  a  uniform  surface  is  the  additive  mixture  of  the  interface  reflection  component 
and  the  body  reflection  component,  its  chromaticity  locus  will  fall  on  a  straight  line  segment 
connecting  the  chromaticity  of  the  interface  reflection  and  the  chromaticity  of  the  body 
reflection. Therefore, each uniform surface will show up in the chromaticity space as a
short line segment. If there are two uniform surfaces of different colors, the intersection
point of their extended line segments is the chromaticity of the illuminant (see Figure 1).
This is the essence of the chromaticity convergence method for determining the illuminant color.

Figure 1: The chromaticity convergence method requires two surfaces of different colors.
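
The convergence idea can be checked numerically. The following Python sketch is an illustration only and is not part of the original memo: the illuminant values, the two body reflectances, the (f, h) mixtures, and all helper names (`chromaticity`, `nir_pixel`, `fit_line`, `intersect`) are hypothetical choices made for this example. It generates a few pixels from two uniform surfaces under the NIR model of equation (3), fits a line to each surface's chromaticities, and intersects the two lines.

```python
def chromaticity(rgb):
    """Return (u, v) for an (Er, Eg, Eb) triple, as in equation (4)."""
    r, g, b = rgb
    total = r + g + b
    return (r / total, g / total)

def nir_pixel(illum, body, f, h, rho_s=1.0, k=1.0):
    """Image irradiance of one pixel under the NIR model, equation (3)."""
    return tuple(k * L * (rho * f + rho_s * h) for L, rho in zip(illum, body))

def fit_line(points):
    """Coefficients (a, b, c) of the line a*u + b*v = c through the first and last point."""
    (u1, v1), (u2, v2) = points[0], points[-1]
    a, b = v2 - v1, u1 - u2
    return a, b, a * u1 + b * v1

def intersect(line1, line2):
    """Intersection of two lines given as (a, b, c), by Cramer's rule."""
    a1, b1, c1 = line1
    a2, b2, c2 = line2
    det = a1 * b2 - a2 * b1
    return ((c1 * b2 - c2 * b1) / det, (a1 * c2 - a2 * c1) / det)

# Hypothetical illuminant (Lr, Lg, Lb); its chromaticity is (0.5, 0.3).
illum = (0.5, 0.3, 0.2)
# Hypothetical (f, h) values: different proportions of body and interface reflection.
mixes = [(1.0, 0.0), (0.8, 0.3), (0.5, 0.9)]

surface_a = [chromaticity(nir_pixel(illum, (1.0, 0.8, 0.4), f, h)) for f, h in mixes]
surface_b = [chromaticity(nir_pixel(illum, (0.2, 0.9, 0.3), f, h)) for f, h in mixes]

# The two chromaticity lines meet at the illuminant chromaticity.
estimate = intersect(fit_line(surface_a), fit_line(surface_b))
print(estimate)
```

The sketch fails (division by zero in `intersect`) when the two body chromaticities happen to be collinear with the illuminant chromaticity, which is the degenerate case in which two surfaces provide no more information than one.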


3 Can One Compute the Illuminant Color From A Single Surface?

The  theory  behind  the  chromaticity  convergence  method  requires  at  least  two  surfaces  of 
different  colors  be  present  in  the  scene  in  order  to  compute  the  illuminant  chromaticity. 
The two surfaces have to reflect the illuminating light as mixtures of the specular and the
diffuse  components  in  various  proportions. 

However,  if  one  looks  at  a  uniform  wall  illuminated  by  a  spot  light,  one  has  a  strong 
impression  not  only  of  the  shading  variations,  but  also  of  the  color  of  the  light  source.  How 
can  it  be  possible  for  us  to  perceive  that  it  is  a  white  wall  illuminated  by  yellow  light, 
but  not  a  yellow  wall  illuminated  by  white  light?  Since  there  is  only  one  surface  in  the 
image,  the  theory  of  the  chromaticity  convergence  method  tells  us  that  the  light  source 
chromaticity  is  constrained  to  be  along  a  straight  line  in  the  chromaticity  diagram,  having 
infinitely  many  possible  solutions. 

Some  thought  will  lead  us  to  the  following  two  possible  solutions: 

1.  In  the  evolution  process,  the  visual  system  had  a  long  period  of  time  to  learn  about  the 
regularity of the color variations in the natural illuminants, e.g., sunlight, skylight, and
fire.  From  early  morning  to  late  afternoon,  daylight  (sunlight  plus  skylight)  continues 
to  change  its  color,  depending  on  the  solar  angle,  the  clouds,  and  the  water  vapor 
content  of  the  air  mass.  The  chromaticities  of  daylight  in  various  phases  of  the  day 
and various amounts of cloudiness have been systematically measured [3]. Their locus
on the CIE 1931 chromaticity diagram is a fairly smooth curve, almost parallel to the
chromaticity locus of blackbody radiation at various temperatures. Furthermore, many
man-made  light  sources  have  similar  regularity  in  their  chromaticity  distributions. 
Partially  due  to  this  reason,  color  temperature  has  been  used  to  specify  the  light 
source  color  [11].  Figure  2  shows  the  CIE  daylight  locus  and  the  chromaticity  locus  of 
the  CIE  standard  illuminant  A.  If  this  chromaticity  curve  is  used  as  a  constraint,  then 
the straight line chromaticity locus of the light reflected from a single surface can be
extended  to  intersect  with  this  illuminant  curve,  yielding  a  unique  solution.  If  the  true 
illuminant  chromaticity  is  not  on  the  illuminant  curve,  the  computed  solution  will  be 
incorrect.  Therefore,  this  solution,  although  biologically  feasible,  is  not  physically  a 
good  solution. 

2.  The  chromaticity  convergence  method  needs  two  surfaces  because  it  does  not  assume 
any  knowledge  about  surface  shape,  lighting  geometry,  or  physical  characteristics 
other  than  those  explicitly  expressed  in  the  NIR  model.  For  example,  the  specular 
reflection is more angle dependent than the diffuse reflection. If our visual system is
aware  of  the  difference,  it  may  be  able  to  compute  the  solution  uniquely.  As  we  will 
show  later,  this  is  indeed  possible. 
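
The first solution above can be sketched in a few lines. This example is not from the memo and rests on an assumption: it uses the standard CIE quadratic approximation of the daylight locus on the CIE 1931 diagram, y = -3.000 x² + 2.870 x - 0.275, taken here to be valid for x between roughly 0.25 and 0.38, and it ignores illuminant A and other man-made sources. The function names and the sample chromaticities are hypothetical.

```python
import math

def daylight_y(x):
    """CIE daylight locus: y as a quadratic function of x (CIE 1931 diagram)."""
    return -3.000 * x * x + 2.870 * x - 0.275

def intersect_with_daylight(p1, p2, x_range=(0.25, 0.38)):
    """Intersect the line through chromaticity points p1 and p2 with the locus.

    Returns the intersection points whose x lies inside x_range.
    Assumes the line is not vertical (p1[0] != p2[0]).
    """
    (x1, y1), (x2, y2) = p1, p2
    m = (y2 - y1) / (x2 - x1)
    # Substituting y = y1 + m (x - x1) into the locus gives
    # 3 x^2 + (m - 2.87) x + (0.275 + y1 - m x1) = 0.
    a, b, c = 3.0, m - 2.870, 0.275 + y1 - m * x1
    disc = b * b - 4.0 * a * c
    if disc < 0.0:
        return []  # the extended line misses the locus entirely
    roots = [(-b - math.sqrt(disc)) / (2.0 * a), (-b + math.sqrt(disc)) / (2.0 * a)]
    return [(x, daylight_y(x)) for x in roots if x_range[0] <= x <= x_range[1]]

# Hypothetical single-surface segment: it runs from a body chromaticity
# toward a daylight illuminant near D65 (x = 0.3127).
body = (0.45, 0.40)
illum = (0.3127, daylight_y(0.3127))
partway = (body[0] + 0.6 * (illum[0] - body[0]), body[1] + 0.6 * (illum[1] - body[1]))

hits = intersect_with_daylight(body, partway)
print(hits)
```

As the text notes, the answer is only as good as the assumption that the true illuminant lies on the curve; an off-locus illuminant would still produce an intersection, just a wrong one.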

4  The  Solution  from  the  Smoothness  Constraint 

If we inspect the image irradiance equations (3) for a uniform surface, we will discover that by
subtracting a properly scaled green irradiance signal from the red irradiance signal we can
completely cancel spatially either the interface reflection component or the body reflection
component. For example,

$$E_r(x,y) - \left(\frac{L_r}{L_g}\right) E_g(x,y) = k L_r (\rho_r - \rho_g) f(x,y) \tag{6}$$

$$E_r(x,y) - \left(\frac{L_r \rho_r}{L_g \rho_g}\right) E_g(x,y) = k L_r \rho_s \left(1 - \frac{\rho_r}{\rho_g}\right) h(x,y). \tag{7}$$

Figure 2: The chromaticity curve of CIE daylight locus.

The  important  implication  is  that  if  we  can  find  the  proper  factors,  we  can  completely 
recover  the  shapes  of  the  two  reflection  components,  and  the  proper  factors  happen  to 
determine  the  chromaticity  of  the  illuminant.  It  is  also  true  that  the  difference  signal  need 
not  be  from  the  red  and  the  green  signals.  Any  other  combination  of  the  three  image 
irradiance  signals  will  also  serve  well.  If  we  want  the  factor  to  come  out  as  the  chromaticity 
of  the  illuminant,  we  can  define: 

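$$\begin{aligned}
E(x,y) &= E_r(x,y) + E_g(x,y) + E_b(x,y) \\
&= k L \left(\rho f(x,y) + \rho_s h(x,y)\right),
\end{aligned} \tag{8}$$

where

$$\begin{aligned}
L &= L_r + L_g + L_b, \\
\rho &= (L_r \rho_r + L_g \rho_g + L_b \rho_b) / L.
\end{aligned} \tag{9}$$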

Now,  the  difference  signals  become: 

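$$\begin{aligned}
E_r(x,y) - \left(\frac{L_r}{L}\right) E(x,y) &= k L_r (\rho_r - \rho) f(x,y), \\
E_g(x,y) - \left(\frac{L_g}{L}\right) E(x,y) &= k L_g (\rho_g - \rho) f(x,y), \\
E_b(x,y) - \left(\frac{L_b}{L}\right) E(x,y) &= k L_b (\rho_b - \rho) f(x,y),
\end{aligned} \tag{10}$$

and the factors $L_r/L$, $L_g/L$, and $L_b/L$ are precisely the chromaticity coordinates of the illuminant.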

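Similar expressions can be written for the $h(x,y)$ component:

$$\begin{aligned}
E_r(x,y) - \left(\frac{L_r \rho_r}{L \rho}\right) E(x,y) &= k L_r \rho_s \left(1 - \frac{\rho_r}{\rho}\right) h(x,y), \\
E_g(x,y) - \left(\frac{L_g \rho_g}{L \rho}\right) E(x,y) &= k L_g \rho_s \left(1 - \frac{\rho_g}{\rho}\right) h(x,y), \\
E_b(x,y) - \left(\frac{L_b \rho_b}{L \rho}\right) E(x,y) &= k L_b \rho_s \left(1 - \frac{\rho_b}{\rho}\right) h(x,y).
\end{aligned} \tag{11}$$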
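
Equations (10) and (11) can be verified numerically on a synthetic one-dimensional signal. The sketch below is an illustration only: the illuminant values, reflectance factors, and the shapes chosen for f and h (a broad cosine and a cosine raised to the 20th power) are assumptions made for this example, not the memo's experimental setup.

```python
import math

# Hypothetical illuminant (Lr, Lg, Lb) and body reflectance factors.
L = (0.5, 0.3, 0.2)
rho = (1.0, 0.8, 0.4)
rho_s, k = 2.0, 1.0

# Assumed shapes: a slowly varying body term f and a sharp interface term h.
xs = [i / 50.0 - 1.0 for i in range(101)]
f = [math.cos(0.8 * x) for x in xs]
h = [math.cos(x - 0.3) ** 20 for x in xs]

# Image irradiances from equation (3), and their sum E from equation (8).
channels = [[k * Lc * (rc * fi + rho_s * hi) for fi, hi in zip(f, h)]
            for Lc, rc in zip(L, rho)]
E = [sum(vals) for vals in zip(*channels)]
L_total = sum(L)
rho_bar = sum(Lc * rc for Lc, rc in zip(L, rho)) / L_total  # equation (9)

Er = channels[0]
# Equation (10): scaling E by Lr/L cancels h and leaves a multiple of f.
body_only = [er - (L[0] / L_total) * e for er, e in zip(Er, E)]
# Equation (11): scaling E by Lr*rho_r/(L*rho) cancels f and leaves a multiple of h.
interface_only = [er - (L[0] * rho[0] / (L_total * rho_bar)) * e for er, e in zip(Er, E)]

err_body = max(abs(d - k * L[0] * (rho[0] - rho_bar) * fi)
               for d, fi in zip(body_only, f))
err_interface = max(abs(d - k * L[0] * rho_s * (1.0 - rho[0] / rho_bar) * hi)
                    for d, hi in zip(interface_only, h))
print(err_body, err_interface)
```

Both residuals are at the level of floating-point rounding, which confirms that the right scale factor isolates one component exactly; the remaining problem, addressed next, is finding that factor without knowing the illuminant.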

Now we know that we can recover the shape of one of the reflection components by
computing $E_r - s_r E$, if we can select $s_r$ “properly”. The remaining question is how to
define what we mean by “properly”. If we have some prior knowledge about the functions
$f(x,y)$ or $h(x,y)$, then we can continue changing the factor $s_r$ until the resulting difference
signal $E_r - s_r E$ behaves the way we know it should. The knowledge need not be precise
or even quantitative. It serves only as a criterion for selecting the right $s_r$. One obvious
choice for the needed constraint is to require that the reflection component $f(x,y)$ be the
smoothest function among all the possible alternatives in the family of functions generated
by $E_r - s_r E$. The choice does have a good physical basis, because when the two reflection
components, each having its own shape, are mixed together, the combined function is not
as smooth as the smoother component alone. This is true because of the following two
reasons:

1. The spatial scale at which the irradiance change occurs is, in general, much smaller
for  the  specular  (interface)  component  than  for  the  diffuse  (body)  component.  The 
diffuse  component  as  modeled  by  a  Lambertian  surface  varies  as  a  function  of  cosine, 
while  the  specular  component  is  typically  modeled  as  the  cosine  function  raised  to 
the  20th  or  40th  power.  Therefore,  of  the  two  reflection  components  the  diffuse 
component  f(x,y)  is  almost  always  the  smoother  one. 

2.  The  peak  of  the  diffuse  component  rarely  occurs  at  the  same  place  as  that  of  the 
specular  component,  because  the  former  is  at  the  place  where  the  surface  normal  is 
pointing to the light source, while the latter occurs at the place where the surface
normal  is  approximately  the  bisector  of  the  angle  between  the  source  vector  and  the 
viewing  vector.  Any  slight  mixture  of  the  specular  component  h(x,y)  with  the  diffuse 
component  f(x,y)  will  create  an  extra  “bump”  in  the  irradiance  signal.  Therefore, 
if  the  factor  sr  is  not  selected  properly,  the  difference  signal  Er  -  srE  will  not  be  as 
smooth  as  when  it  is. 
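
The scale difference named in the first reason can be made concrete by asking at what angle each lobe drops to half of its peak value. The short sketch below is an illustration, not a computation from the memo; the function name `half_width_deg` is hypothetical.

```python
import math

def half_width_deg(n):
    """Angle (degrees) at which cos(theta)**n falls to half of its peak value."""
    return math.degrees(math.acos(0.5 ** (1.0 / n)))

# n = 1 is the Lambertian cosine; n = 20 and n = 40 are the specular exponents
# mentioned in the text.
widths = {n: half_width_deg(n) for n in (1, 20, 40)}
for n, w in widths.items():
    print(n, round(w, 1))
```

The Lambertian lobe reaches half height at 60 degrees, while the 20th and 40th power lobes do so at roughly 15 and 10.6 degrees, so the specular term varies four to six times faster in angle.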

One  possible  violation  of  the  first  condition  is  when  the  light  source  is  very  close  to 
the  illuminated  surface,  and  therefore  creates  a  sharply  changing  f(x,y)  because  of  the 
inverse  square  fall-off  of  the  light  intensity,  and  because  of  the  large  change  of  the  incident 
angle within a short distance. Also, a surface with large curvatures has a similar effect.
Another  possibility  is  a  microscopically  (in  terms  of  the  resolution  of  the  imaging  system) 
very  rough  surface,  making  the  specular  component  very  non-directional.  Violation  of  the 
second  condition  can  happen  under  the  single  light  source  assumption  only  when  the  light 
source  is  located  on  the  optical  axis  of  the  imaging  system,  a  physically  unlikely  situation. 
With  multiple  light  sources  and  mutual  illumination  among  surfaces,  the  second  condition 
can  be  violated,  but  not  frequently.  In  most  practical  cases,  the  two  types  of  violation  are 
rare,  and  the  smoothness  constraint  should  give  a  fairly  reasonable  answer. 

It is important to understand that the smoothness constraint here refers to the assumption
that the body reflection function, $f(x,y)$, is much smoother than the interface
reflection, $h(x,y)$. It does not imply that the underlying surface has to be macroscopically
smooth. Because the relations between the surface shape and the shape of the image irradiance
can be very complicated, smoothness in the shape of the image irradiance signal
does not imply smoothness in the surface shape, and vice versa.

It seems that equations (10) and (11) would give two “good” solutions for selecting the
“proper” $s_r$. However, as a consequence of the smoothness constraint, the solution in
equation (10) is chosen most of the time, because the body reflection component $f(x,y)$ is
almost always smoother than the interface reflection component $h(x,y)$.

The next question is how to define smoothness. For computational reasons, we choose
the Laplacian operator for its symmetry. It is very likely that another choice of measure of
smoothness could give equally good or better results. We can now put the problem in the
following mathematical form: select $s_r$ and $s_g$ such that the following are minimized:

$$\begin{aligned}
&\int \left[\nabla^2 \left(E_r(x,y) - s_r E(x,y)\right)\right]^2 dx\, dy, \\
&\int \left[\nabla^2 \left(E_g(x,y) - s_g E(x,y)\right)\right]^2 dx\, dy.
\end{aligned} \tag{12}$$

The solution turns out to be quite simple:

$$\begin{aligned}
s_r &= \frac{\int \nabla^2 E_r(x,y)\, \nabla^2 E(x,y)\, dx\, dy}{\int \nabla^2 E(x,y)\, \nabla^2 E(x,y)\, dx\, dy}, \\
s_g &= \frac{\int \nabla^2 E_g(x,y)\, \nabla^2 E(x,y)\, dx\, dy}{\int \nabla^2 E(x,y)\, \nabla^2 E(x,y)\, dx\, dy}.
\end{aligned} \tag{13}$$
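
A minimal one-dimensional sketch of this estimator follows. It is an illustration under stated assumptions rather than the implementation used in the memo: the integrals are replaced by sums over samples, the Laplacian by the second difference, and the test signal (illuminant values, reflectance factors, shapes of f and h) as well as the names `laplacian_1d`, `estimate_illuminant`, and `render` are hypothetical.

```python
import math

def laplacian_1d(signal):
    """Second difference of a 1-D signal at its interior samples."""
    return [signal[i - 1] - 2.0 * signal[i] + signal[i + 1]
            for i in range(1, len(signal) - 1)]

def estimate_illuminant(Er, Eg, Eb):
    """Least-squares s_r and s_g of equation (13), with sums replacing integrals."""
    E = [r + g + b for r, g, b in zip(Er, Eg, Eb)]
    lap_E = laplacian_1d(E)
    denom = sum(v * v for v in lap_E)
    s_r = sum(a * b for a, b in zip(laplacian_1d(Er), lap_E)) / denom
    s_g = sum(a * b for a, b in zip(laplacian_1d(Eg), lap_E)) / denom
    return s_r, s_g

def render(L, rho, f, h, rho_s=2.0, k=1.0):
    """Synthesize (Er, Eg, Eb) from the NIR model, equation (3)."""
    return [[k * Lc * (rc * fi + rho_s * hi) for fi, hi in zip(f, h)]
            for Lc, rc in zip(L, rho)]

# Assumed test signal: broad body term, sharp interface term.
xs = [i / 100.0 - 1.0 for i in range(201)]
f = [math.cos(0.5 * x) for x in xs]
h = [math.cos(x - 0.3) ** 20 for x in xs]

# The true illuminant chromaticity for these values is (0.5, 0.3).
Er, Eg, Eb = render((0.5, 0.3, 0.2), (1.0, 0.8, 0.4), f, h)
s_r, s_g = estimate_illuminant(Er, Eg, Eb)
print(s_r, s_g)
```

Setting the derivative of each integral in (12) with respect to its scale factor to zero gives the ratio form directly, which is why no iterative search over $s_r$ is needed. When f has zero Laplacian (for example, a linear ramp), the sketch returns the illuminant chromaticity exactly.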

The solution given by (13) has the following nice property. As we have seen in Section 2,
the chromaticity of the light reflected from a uniform surface falls on a straight line pointing
toward the illuminant chromaticity. Let the straight line equation be $au + bv = c$, i.e.,

$$a\left(\frac{E_r}{E}\right) + b\left(\frac{E_g}{E}\right) = c, \tag{14}$$

or,

$$a E_r + b E_g = c E. \tag{15}$$

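It can be shown that the solution given by (13) is also on the same line, and therefore it satisfies
the (only) constraint which can be derived from physics without making assumptions
about the surface shape and imaging geometry. That means the solution is guaranteed to
be a feasible solution. This is true because the solution in (13) is still a linear combination
of the irradiance signals. The proof is as follows:

$$\begin{aligned}
a s_r + b s_g &= a\, \frac{\int \nabla^2 E_r\, \nabla^2 E\, dx\, dy}{\int \nabla^2 E\, \nabla^2 E\, dx\, dy} + b\, \frac{\int \nabla^2 E_g\, \nabla^2 E\, dx\, dy}{\int \nabla^2 E\, \nabla^2 E\, dx\, dy} \\
&= \frac{\int \nabla^2 (a E_r + b E_g)\, \nabla^2 E\, dx\, dy}{\int \nabla^2 E\, \nabla^2 E\, dx\, dy} \\
&= \frac{\int \nabla^2 (c E)\, \nabla^2 E\, dx\, dy}{\int \nabla^2 E\, \nabla^2 E\, dx\, dy} \\
&= c.
\end{aligned}$$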


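If we expand the solution (13) in terms of the functions $f(x,y)$ and $h(x,y)$, we get the
following expressions:

$$\begin{aligned}
s_r &= \left(\frac{L_r}{L}\right) \frac{\int \left(\rho_s^2 (\nabla^2 h)^2 + \rho_r \rho\, (\nabla^2 f)^2 + \rho_s (\rho_r + \rho)\, \nabla^2 h\, \nabla^2 f\right) dx\, dy}{\int \left(\rho_s^2 (\nabla^2 h)^2 + \rho^2 (\nabla^2 f)^2 + 2 \rho_s \rho\, \nabla^2 h\, \nabla^2 f\right) dx\, dy}, \\
s_g &= \left(\frac{L_g}{L}\right) \frac{\int \left(\rho_s^2 (\nabla^2 h)^2 + \rho_g \rho\, (\nabla^2 f)^2 + \rho_s (\rho_g + \rho)\, \nabla^2 h\, \nabla^2 f\right) dx\, dy}{\int \left(\rho_s^2 (\nabla^2 h)^2 + \rho^2 (\nabla^2 f)^2 + 2 \rho_s \rho\, \nabla^2 h\, \nabla^2 f\right) dx\, dy}.
\end{aligned} \tag{17}$$

They show that the chromaticity estimation is accurate only when the term $\rho_s^2 (\nabla^2 h(x,y))^2$
is much larger than the sum of the other terms, and this suggests methods to improve the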
accuracy  of  the  estimate.  For  example,  we  can  give  more  weight  to  the  regions  of  large 
irradiance  because  the  specular  reflection  region  is  often  brighter  than  the  neighboring 
regions.  Another  way  is  to  compute  the  rate  of  chromaticity  change  at  each  pixel  and 
use  it  as  a  weighting  function,  because  only  when  the  specular  reflection  is  significant  in 
magnitude  relative  to  the  diffuse  reflection  would  it  cause  a  noticeable  chromaticity  change. 
The  following  section  will  show  how  well  the  estimator  works,  and  when  it  could  fail. 


5  Estimating  the  Illuminant  Chromaticity 

5.1  From  the  Shading  of  A  Uniform  Surface 

The estimator derived in the last section allows us to estimate only the illuminant chromaticity.
In order to show how well the estimator works, we also estimate the chromaticity
of the body reflection component, so that we can plot both components at the same time.
For a uniform surface, the algorithm computes the chromaticity distribution of all the pixels.
Because this is a uniform surface, the chromaticity locus is a straight line segment in the
chromaticity space. To compute the chromaticity of the body reflection component, the
algorithm simply takes the end point of the line segment that is farthest away from the
estimated illuminant chromaticity. This is equivalent to the assumption that the pixel which
has the most “saturated” color has only a negligible amount of interface reflection,
compared with its body reflection. Knowing the chromaticities of both reflection components,
we can separate the two components by simply projecting the red, green, and blue irradiance
signals to the two vectors representing the reflection components in the $(E_r, E_g, E_b)$
space.
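
One way to read the two steps above is sketched below. This is an interpretation, not the memo's code: the “projection” is implemented here as a least-squares decomposition of each pixel onto the two chromaticity directions, which is an assumption, and the function names and sample values are hypothetical.

```python
def body_chromaticity(points, illum_uv):
    """Pick the chromaticity point farthest from the estimated illuminant."""
    return max(points, key=lambda p: (p[0] - illum_uv[0]) ** 2 + (p[1] - illum_uv[1]) ** 2)

def to_vector(uv):
    """Direction in (Er, Eg, Eb) space that has chromaticity (u, v)."""
    u, v = uv
    return (u, v, 1.0 - u - v)

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def separate(pixel, body_uv, illum_uv):
    """Least-squares amounts of body and interface reflection in one pixel.

    Solves pixel ~ alpha * body_dir + beta * illum_dir via the 2x2 normal equations.
    """
    b, s = to_vector(body_uv), to_vector(illum_uv)
    bb, ss, bs = dot(b, b), dot(s, s), dot(b, s)
    pb, ps = dot(pixel, b), dot(pixel, s)
    det = bb * ss - bs * bs
    return ((pb * ss - ps * bs) / det, (ps * bb - pb * bs) / det)

# Hypothetical values: an estimated illuminant and a few pixel chromaticities.
illum_uv = (0.3333, 0.3333)
pixel_uvs = [(0.35, 0.33), (0.50, 0.35), (0.40, 0.34)]
body_uv = body_chromaticity(pixel_uvs, illum_uv)

# A pixel built from 3 units of body and 2 units of interface reflection.
b_dir, s_dir = to_vector(body_uv), to_vector(illum_uv)
pixel = tuple(3.0 * bi + 2.0 * si for bi, si in zip(b_dir, s_dir))
alpha, beta = separate(pixel, body_uv, illum_uv)
print(body_uv, alpha, beta)
```

The decomposition becomes ill-conditioned when the body and illuminant chromaticities are close, since the two direction vectors are then nearly parallel.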

Separation of reflection components has been studied by Klinker, Shafer, and Kanade
[4],  and  by  Gershon,  Jepson,  and  Tsotsos  [2].  The  former  group  searches  for  a  “skewed  T” 
signature,  while  the  latter  searches  for  a  “dog  leg”  distribution  in  three-dimensional  color 
space.  Both  methods  require  that  the  specular  reflection  component  rise  and  fall  in  a  very 
short  spatial  distance  within  which  the  diffuse  component  is  essentially  constant.  Although 
the  illuminant  chromaticity  estimator  developed  here  is  based  on  the  same  idea  that  the 
rate  of  change  in  the  specular  reflection  component  is  larger  than  the  diffuse  component,  the 
requirement  is  less  stringent  because  the  estimator  does  not  explicitly  search  for  signatures 
or  patterns. 

Let us start with experiments on one-dimensional signals. A point light source is
positioned in front of a plane. The plane is tilted at an angle of $\theta$ with respect to the
optical axis of the camera (see Fig. 3). The spatial coordinate system has the origin at the
center of the image plane. The z axis, x = 0, corresponds to the optical axis. The focal point
is located at (x = 0, z = f). The object plane intersects with the z axis at z = -d. The
light source position $(x_s, z_s)$ and the angle $\theta$ are changed to produce different irradiance
signals on the image plane. The focal length f is chosen to be 50 mm. The distance from the
image plane to the center of the surface is 395 cm. The light source chromaticity is (0.3333,
0.3333), and the surface reflectance factors $(\rho_r, \rho_g, \rho_b)$ are (1.0, 0.8, 0.4). The specular
reflection is calculated according to Phong’s model [8] with the cosine raised to the 20th
power. The specular reflectance factor $\rho_s$ is 2.0. Two values for the angle $\theta$ are used: 50
and 38 degrees.

Figures  4,  5,  6,  and  7  show  the  results  of  the  estimation  for  different  light  source 
positions.  The  top  graphs  show  the  input  image  irradiance  signal  from  the  red  channel. 
The dotted curves are the true reflection components, f(x) and h(x). The bottom graphs
show  the  estimated  illuminant  chromaticity,  and  the  two  estimated  reflection  components. 

As can be seen, the estimate is, in general, very accurate. The performance begins to
deteriorate when the light source gets very close to the camera (see Fig. 5) or when the
specular component is very weak (see Fig. 7). When the light source gets very close to the
camera, the light incident angle changes so fast from one point to another that the diffuse
component is no longer “much smoother” than the specular component. In the case of Fig.
5, the light source is only 65.75 cm away from the object plane surface, and neither one of
the two components looks much smoother than the other.

Figure 5: Separation of reflection components. Example II.

Light source at $(x_s, z_s)$: 183 cm, -106 cm. Illuminant chromaticity: (0.3333, 0.3333). Angle: 38 degrees.

Figure 6: Separation of reflection components. Example III.

To  test  the  estimator  on  two-dimensional  images,  we  replace  the  plane  with  a  sphere 
whose  center  is  395cm  away  from  the  image  plane.  The  radius  of  the  sphere  is  120cm.  The 
point light source is located at x = 300cm, y = 300cm, and z = 400cm. The specular
reflectance  factor  is  2.0,  and  it  is  modeled  as  the  cosine  raised  to  the  40th  power.  In  this 
experiment,  the  illuminant  chromaticity  is  changed  from  image  to  image,  and  the  estimator 
is  applied  to  each  image  to  estimate  the  illuminant  color.  The  following  table  shows  the 
results: 


image   $\rho_r$  $\rho_g$  $\rho_b$    true illuminant (u, v)    estimated illuminant (u, v)
w000    1.0  0.4  0.6       0.5027  0.3995            0.5757  0.3326
w001    1.0  0.4  0.6       0.2631  0.4245            0.3128  0.3780
w002    0.4  0.6  1.0       0.5027  0.3995            0.4702  0.4109
w003    1.0  0.4  0.6       0.1566  0.3586            0.1878  0.3253

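As discussed in the last section, there are several ways to improve the estimation
results by weighting each pixel differently, i.e., by selecting $s_r$ and $s_g$ such that the
following are minimized:

$$\begin{aligned}
&\int W(x,y) \left[\nabla^2 \left(E_r(x,y) - s_r E(x,y)\right)\right]^2 dx\, dy, \\
&\int W(x,y) \left[\nabla^2 \left(E_g(x,y) - s_g E(x,y)\right)\right]^2 dx\, dy.
\end{aligned}$$

The solution becomes:

$$\begin{aligned}
s_r &= \frac{\int W(x,y)\, \nabla^2 E_r(x,y)\, \nabla^2 E(x,y)\, dx\, dy}{\int W(x,y)\, \nabla^2 E(x,y)\, \nabla^2 E(x,y)\, dx\, dy}, \\
s_g &= \frac{\int W(x,y)\, \nabla^2 E_g(x,y)\, \nabla^2 E(x,y)\, dx\, dy}{\int W(x,y)\, \nabla^2 E(x,y)\, \nabla^2 E(x,y)\, dx\, dy}.
\end{aligned}$$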
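
The weighted form differs from the unweighted one only by a per-sample weight inside each sum. The one-dimensional sketch below is an illustration under assumptions: the weight is taken as the total irradiance raised to a free parameter `power` (with `power=0` recovering the unweighted estimator), the signal is synthetic, and all names are hypothetical.

```python
import math

def laplacian_1d(signal):
    """Second difference of a 1-D signal at its interior samples."""
    return [signal[i - 1] - 2.0 * signal[i] + signal[i + 1]
            for i in range(1, len(signal) - 1)]

def weighted_estimate(Er, Eg, Eb, power):
    """Weighted least-squares s_r and s_g with W = E ** power (an assumed choice)."""
    E = [r + g + b for r, g, b in zip(Er, Eg, Eb)]
    lap_E = laplacian_1d(E)
    # Weights are aligned with the interior samples where the Laplacian is defined.
    W = [e ** power for e in E[1:-1]]
    denom = sum(w * v * v for w, v in zip(W, lap_E))
    s_r = sum(w * a * b for w, a, b in zip(W, laplacian_1d(Er), lap_E)) / denom
    s_g = sum(w * a * b for w, a, b in zip(W, laplacian_1d(Eg), lap_E)) / denom
    return s_r, s_g

# Synthetic NIR signal (hypothetical values); true illuminant chromaticity is (0.5, 0.3).
L, rho, rho_s = (0.5, 0.3, 0.2), (1.0, 0.8, 0.4), 2.0
xs = [i / 100.0 - 1.0 for i in range(201)]
f = [math.cos(0.5 * x) for x in xs]
h = [math.cos(x - 0.3) ** 20 for x in xs]
Er, Eg, Eb = [[Lc * (rc * fi + rho_s * hi) for fi, hi in zip(f, h)]
              for Lc, rc in zip(L, rho)]

unweighted = weighted_estimate(Er, Eg, Eb, power=0)
weighted = weighted_estimate(Er, Eg, Eb, power=4)
print(unweighted, weighted)
```

Because the weight multiplies numerator and denominator alike, the linear-combination argument used for the unweighted estimator carries over unchanged, so the weighted estimate stays on the chromaticity line of the surface.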

It  can  easily  be  shown  that  the  modified  solution  still  satisfies  the  chromaticity  collinearity 
constraint.  So  the  improvement  incurs  some  cost  for  computation,  but  does  not  sacrifice  the 
original  nice  property.  Since  the  highlight  is  always  brighter  than  its  surrounds,  a  simple 
weighting scheme is to make W(x, y) a function of the image irradiance. The following table
shows  how  much  the  estimation  accuracy  can  be  improved  by  a  simple  weighting  function 
$W(x,y) = (E(x,y))^4$.


image   $\rho_r$  $\rho_g$  $\rho_b$    true illuminant (u, v)    estimated illuminant (u, v)
w000    1.0  0.4  0.6       0.5027  0.3995            0.5037  0.3985
w001    1.0  0.4  0.6       0.2631  0.4245            0.2638  0.4238
w002    0.4  0.6  1.0       0.5027  0.3995            0.5022  0.3996
w003    1.0  0.4  0.6       0.1566  0.3586            0.1570  0.3581

Fig. 8 shows the red record of one of the images (w000). Fig. 9 shows its estimated specular
component, and Fig. 10 its estimated diffuse component.

Figure 8: An image of a uniform spherical surface.

Figure 9: The estimated interface reflection component of Figure 8.

Figure 10: The estimated body reflection component of Figure 8.


5.2  From  the  Shading  of  A  Non-uniform  Surface 


Extension  of  the  above  illuminant  estimator  to  a  non-uniform  surface  is  conceptually 
straightforward.  If  we  examine  equations  6  and  7,  we  find  that  the  derivation  of  the 
estimator  based  on  the  smoothness  constraint  still  applies  to  the  case  of  a  non-uniform 
surface, provided that we can detect and exclude, from our computation of Laplacians, the
edge pixels between regions of different body reflectance factors. Although it is well known
that  edge  detection  itself  is  difficult  and  sensitive  to  noise,  the  situation  here  is  not  as  bad, 
because  we  can  afford  to  exclude  the  false  edges  caused  by  noise  from  our  computation 
without  losing  much  of  the  essential  information  for  illuminant  color  estimation. 

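If the non-uniform surface consists of two differently colored uniform regions i and j,
then

$$\begin{aligned}
s_r &= \frac{\int_{i+j} \nabla^2 E_r\, \nabla^2 E\, dx\, dy}{\int_{i+j} \nabla^2 E\, \nabla^2 E\, dx\, dy} \\
&= \alpha_i\, \frac{\int_i \nabla^2 E_r\, \nabla^2 E\, dx\, dy}{\int_i \nabla^2 E\, \nabla^2 E\, dx\, dy} + \alpha_j\, \frac{\int_j \nabla^2 E_r\, \nabla^2 E\, dx\, dy}{\int_j \nabla^2 E\, \nabla^2 E\, dx\, dy},
\end{aligned} \tag{19}$$

where

$$\alpha_i = \frac{\int_i \nabla^2 E\, \nabla^2 E\, dx\, dy}{\int_{i+j} \nabla^2 E\, \nabla^2 E\, dx\, dy}, \qquad \alpha_j = \frac{\int_j \nabla^2 E\, \nabla^2 E\, dx\, dy}{\int_{i+j} \nabla^2 E\, \nabla^2 E\, dx\, dy}.$$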
The illuminant estimator extended this way therefore has a very good physical interpretation:
it is the weighted average of the estimates made by each uniform region on the
non-uniform surface. This is true even if the uniformly colored regions are not connected
together, because the relation between f and h should hold in each region. For example,
on  a  tiled  surface,  all  the  areas  covered  by  the  same  colored  tiles  are  combined  together  to 
make  one  estimate,  which  is  in  turn  combined  with  estimates  from  other  colors.  Therefore, 
in  principle  it  does  not  matter  how  many  regions  any  color  is  broken  into.  Of  course,  in 
practice,  the  boundary  pixels  have  to  be  excluded,  and  therefore  there  is  additional  loss  of 
information  for  every  breakup. 
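
The weighted-average reading of equation 19 can be checked numerically. The following is a minimal sketch, not the memo's implementation: the arrays `lap_e` and `lap_r` stand in for the laplacians of the total and red irradiance, the two masks are hypothetical disconnected regions, and all names are illustrative.

```python
import numpy as np

def region_terms(lap_c, lap_e, mask):
    """Numerator and denominator of the estimator restricted to one region."""
    num = float(np.sum(lap_c[mask] * lap_e[mask]))
    den = float(np.sum(lap_e[mask] ** 2))
    return num, den

def pooled_and_weighted(lap_c, lap_e, masks):
    """Return the pooled estimate and the weighted average of per-region estimates."""
    terms = [region_terms(lap_c, lap_e, m) for m in masks]
    total_den = sum(den for _, den in terms)
    pooled = sum(num for num, _ in terms) / total_den
    # Each region's weight is its share of the total laplacian energy.
    weighted = sum((den / total_den) * (num / den) for num, den in terms)
    return pooled, weighted

# Synthetic stand-ins for the laplacian images (assumed data, fixed seed).
rng = np.random.default_rng(0)
lap_e = rng.normal(size=(32, 32))
lap_r = 0.4 * lap_e + 0.05 * rng.normal(size=(32, 32))

# Two regions that are not connected to each other.
mask_i = np.zeros((32, 32), dtype=bool)
mask_i[:, :12] = True
mask_j = np.zeros((32, 32), dtype=bool)
mask_j[:, 20:] = True

pooled, weighted = pooled_and_weighted(lap_r, lap_e, [mask_i, mask_j])
```

The two numbers agree to rounding error for any choice of disjoint masks, which is why the number of pieces a color is broken into does not matter in principle.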

To generate images of a non-uniform surface, random rectangles of different sizes and
colors are mapped to the surface of a sphere. The center of the sphere is again placed
395cm away from the image plane. The position of the light source and the distribution
of colors of the random rectangles are changed to generate images with different shading
and color distributions. To exclude the boundary pixels between regions of different
colors, two approaches are tried: (1) exclude pixels whose laplacian magnitudes are above
a predetermined threshold; (2) exclude pixels which are explicitly detected as boundary
pixels. The idea behind the first approach is to treat any pixel with an unusually large
laplacian magnitude as coming from physical events not related to the scale difference of
interest to the estimator. It requires the determination of a threshold, and the practical
question is how sensitive the estimate is to the exact value of the threshold. We pick two
images, compute their histograms of laplacian values, and select the threshold as 0.75 times
the average value of the nonzero laplacians. We use this threshold to process 12 images.
The first four images, r000 to r003, are generated with the red, green, and blue reflectance
factor distributions having equal means, and the light source located at x = 300cm, y =
300cm, and z = 400cm. We are particularly interested in comparing our estimate with the
estimate given by the gray-world assumption, which takes the average image irradiances of
the red, green, and blue records as the illuminant color. The results are shown as follows:


image   true illuminant (u,v)   estimated illuminant (u,v)   gray-world estimate (u,v)
r000    0.5027  0.3995          0.4552  0.3991               0.4849  0.4176
r001    0.2631  0.4245          0.2080  0.4864               0.2552  0.4192
r002    0.5027  0.3995          0.5046  0.3904               0.4956  0.4009
r003    0.1566  0.3586          0.0999  0.2762               0.1495  0.3718

As can be seen, in this case the gray-world estimates are in general more accurate. We then
generate two more sets of images (s000 - s003, and f000 - f003) which do not have equal
means in the red, green, and blue reflectance distributions. The s000 - s003 images have
the light source located at the same position as that of the r000 - r003 images, while the
light source of the f000 - f003 images is located at x = 300cm, y = 300cm, and z = -100cm.
The results are shown as follows:


image   true illuminant (u,v)   estimated illuminant (u,v)   gray-world estimate (u,v)
s000    0.5027  0.3995          0.5196  0.3542               0.6761  0.2622
s001    0.2631  0.4245          0.2780  0.4445               0.4435  0.3141
s002    0.5027  0.3995          0.5269  0.3775               0.6918  0.2453
s003    0.1566  0.3586          0.1509  0.2705               0.2830  0.3122

image   true illuminant (u,v)   estimated illuminant (u,v)   gray-world estimate (u,v)
f000    0.5027  0.3995          0.5031  0.3S57               0.6700  0.2707
f001    0.2631  0.4245          0.2897  0.3068               0.4550  0.2815
f002    0.5027  0.3995          0.5322  0.3395               0.7096  0.2218
f003    0.1566  0.3586          0.1358  0.2679               0.2836  0.3250

This time the failure of the gray-world estimation is serious indeed. By comparison, our
illuminant estimator is more accurate. If we raise the threshold by 20%, the results remain
about the same, because most edges are very sharp in these synthetic images.
Some edges in the very dark shadings are still included in the computation of the
estimate. They are not separable from the shading simply by thresholding, and are the
major source of the error.
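
The thresholding approach and the gray-world comparison can be sketched as follows. This is an illustration under stated assumptions rather than the memo's code: E is taken to be the sum of the three color records, the threshold is computed per image, the synthetic image follows an assumed model E_c = L_c (rho_c f + h) with a linear ramp f and a small Gaussian highlight h, and every function and variable name is hypothetical.

```python
import numpy as np

def laplacian(img):
    """5-point discrete laplacian; border pixels are left at zero."""
    img = np.asarray(img, dtype=float)
    lap = np.zeros_like(img)
    lap[1:-1, 1:-1] = (img[:-2, 1:-1] + img[2:, 1:-1] +
                       img[1:-1, :-2] + img[1:-1, 2:] - 4.0 * img[1:-1, 1:-1])
    return lap

def threshold_mask(lap_e, factor=0.75):
    """Keep pixels whose |laplacian| is at most factor * mean nonzero magnitude."""
    mag = np.abs(lap_e)
    nonzero = mag[mag > 0]
    thr = factor * nonzero.mean() if nonzero.size else 0.0
    return mag <= thr

def estimate(rgb, keep):
    """Per-channel ratio sum(lap E_c * lap E) / sum(lap E * lap E) over kept pixels."""
    lap_e = laplacian(rgb.sum(axis=2))  # assumption: E = E_r + E_g + E_b
    den = np.sum(lap_e[keep] ** 2)
    return np.array([np.sum(laplacian(rgb[..., c])[keep] * lap_e[keep]) / den
                     for c in range(3)])

def gray_world(rgb):
    """Gray-world estimate: normalized channel means."""
    means = rgb.reshape(-1, 3).mean(axis=0)
    return means / means.sum()

# Assumed synthetic scene: two colored regions split at column 32.
size = 64
y, x = np.mgrid[0:size, 0:size].astype(float)
f = 0.5 + 0.5 * x / (size - 1)                        # smooth body shading (linear ramp)
h = 0.5 * np.exp(-((x - 14) ** 2 + (y - 32) ** 2) / (2 * 3.0 ** 2))  # highlight
light = np.array([1.0, 0.8, 0.6])                     # illuminant (assumed)
rho = np.where((x < 32)[..., None], [0.8, 0.3, 0.2], [0.1, 0.2, 0.3])
rgb = light * (rho * f[..., None] + h[..., None])

keep = threshold_mask(laplacian(rgb.sum(axis=2)))
est_kept = estimate(rgb, keep)                        # edge pixels excluded
est_all = estimate(rgb, np.ones((size, size), dtype=bool))  # edge pixels included
truth = light / light.sum()
gw = gray_world(rgb)
```

In this toy scene the body shading has zero laplacian, so `est_kept` recovers `truth` almost exactly, while `est_all` is dominated by the reflectance edge. Real shading is not this clean, which is where the dark-edge errors described above come from.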

The second approach is to use edge detection to locate the boundary pixels. A simple
gradient edge detector with non-maximum suppression is applied to the individual red,
green, and blue records. Any pixels detected as edges in one of the color records are
declared as color edges. To ensure that all region boundaries are excluded properly, pixels
which are less than two pixels away from the detected color edges are also excluded from
the laplacian computation. Fig. 11 shows the red record of the image r000, and Fig. 12
shows the pixels excluded by edge detection.


Figure 11: An image of a non-uniform spherical surface.


The following table shows the estimation results:


image   true illuminant (u,v)   estimated illuminant (u,v)   gray-world estimate (u,v)
f000    0.5027  0.3995          0.5129  0.3881               0.6700  0.2707
f001    0.2631  0.4245          0.2872  0.4239               0.4550  0.2815
f002    0.5027  0.3995          0.5118  0.3942               0.7096  0.2218
f003    0.1566  0.3586          0.1710  0.3487               0.2836  0.3250

We now apply the simple weighting function $W(x,y) = (E(x,y))^4$, as we did in the uniform
surface cases, to improve the estimates. The following tables show the results:


image   true illuminant (u,v)   estimated illuminant (u,v)   gray-world estimate (u,v)
r000    0.5027  0.3995          0.5032  0.3992               0.4849  0.4176
r001    0.2631  0.4245          0.2634  0.4252               0.2552  0.4192
r002    0.5027  0.3995          0.5001  0.4019               0.4956  0.4009
r003    0.1566  0.3586          0.1571  0.3597               0.1495  0.3718

image   true illuminant (u,v)   estimated illuminant (u,v)   gray-world estimate (u,v)
f000    0.5027  0.3995          0.5040  0.3986               0.6700  0.2707
f001    0.2631  0.4245          0.2651  0.4219               0.4550  0.2815
f002    0.5027  0.3995          0.5043  0.3978               0.7096  0.2218
f003    0.1566  0.3586          0.1589  0.3580               0.2836  0.3250

image   true illuminant (u,v)   estimated illuminant (u,v)   gray-world estimate (u,v)
s000    0.5027  0.3995          0.5040  0.3986               0.6761  0.2622
s001    0.2631  0.4245          0.2658  0.4231               0.4435  0.3141
s002    0.5027  0.3995          0.5045  0.3981               0.6918  0.2453
s003    0.1566  0.3586          0.1584  0.3589               0.2830  0.3122

If we move the light source to x = 300cm, y = 300cm, and z = 40000cm, we have the
following results:


image   true illuminant (u,v)   estimated illuminant (u,v)   gray-world estimate (u,v)
        0.5027  0.3995          0.5039  0.3987               0.6799  0.2576
        0.2631  0.4245          0.2659  0.4219               0.4380  0.3294
        0.5027  0.3995          0.5046  0.3977               0.6830  0.2566
        0.1566  0.3586          0.1581  0.3591               0.2840  0.3059

As can be seen, all the estimates are consistently very accurate. They are now better
than the gray-world estimates even when the images are generated by gray-world statistics
(images r000 to r003).
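
How the weighting enters the estimator is not restated in this section. The sketch below assumes that W(x,y) multiplies the integrand of both the numerator and the denominator, which is one natural reading and not a confirmed detail of the method. The data are synthetic, with noise confined to dark pixels to mimic the dark-edge errors mentioned earlier, and all names are hypothetical.

```python
import numpy as np

def weighted_ratio(lap_c, lap_e, irradiance, keep, power=4):
    """Estimator with per-pixel weight W = irradiance ** power (assumed form)."""
    w = irradiance[keep] ** power
    num = np.sum(w * lap_c[keep] * lap_e[keep])
    den = np.sum(w * lap_e[keep] ** 2)
    return num / den

# Synthetic stand-ins (assumed data, fixed seed).
rng = np.random.default_rng(1)
irradiance = rng.uniform(0.1, 1.0, size=(16, 16))
lap_e = rng.normal(size=(16, 16))
noise = rng.normal(size=(16, 16))

# The true ratio is 0.3; contamination is present only where the image is dark.
lap_c = 0.3 * lap_e + 0.2 * noise * (irradiance < 0.5)

keep = np.ones((16, 16), dtype=bool)
s_weighted = weighted_ratio(lap_c, lap_e, irradiance, keep)          # W = E^4
s_plain = weighted_ratio(lap_c, lap_e, irradiance, keep, power=0)    # W = 1
```

Raising the irradiance to the fourth power gives bright pixels far more influence than dark ones, so contamination that lives in the dark shading contributes little to the weighted ratio. Setting `power=0` recovers the unweighted estimator.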

It should be pointed out that the laplacian signal of the interface reflection component
h(x,y) is relatively small compared with that of a reflectance edge. If any reflectance edge
pixel is not excluded from the estimation, the result can be very wrong. For example, the
gradient edge detector sometimes computes the location of the true edge incorrectly. If
we had not excluded pixels which are neighbors of the detected edges, the estimation
results would be useless for some images, as can be seen in the following table. The columns
marked as 1-pixel and 2-pixel are the estimation results from excluding pixels which are
at a distance of less than or equal to one and two pixels, respectively, from the detected
edges.


image   true (u,v)        edge alone (u,v)   1-pixel (u,v)     2-pixel (u,v)
f000    0.5027  0.3995    0.5044  0.3982     0.5041  0.3985    0.5040  0.3986
f001    0.2631  0.4245    0.4059  0.3412     0.2653  0.4216    0.2651  0.4219
f002    0.5027  0.3995    0.5487  0.3622     0.5044  0.3977    0.5043  0.3978
f003    0.1566  0.3586    0.1621  0.3537     0.1598  0.3569    0.1589  0.3580

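
The edge-based exclusion, including the safety margin around detected edges, can be sketched as below. This is a simplified stand-in for the detector described in the text: non-maximum suppression is done only along the dominant image axis, the margin uses city-block distance, the gradient threshold is arbitrary, and all names are hypothetical.

```python
import numpy as np

def gradient_edges(channel, threshold):
    """Gradient-magnitude edges with axis-aligned non-maximum suppression."""
    gy, gx = np.gradient(np.asarray(channel, dtype=float))
    mag = np.hypot(gx, gy)
    p = np.pad(mag, 1, mode="edge")
    left, right = p[1:-1, :-2], p[1:-1, 2:]
    up, down = p[:-2, 1:-1], p[2:, 1:-1]
    horizontal = np.abs(gx) >= np.abs(gy)
    # Compare each pixel with its two neighbors along the dominant gradient axis.
    is_max = np.where(horizontal,
                      (mag >= left) & (mag >= right),
                      (mag >= up) & (mag >= down))
    return is_max & (mag > threshold)

def color_edges(rgb, threshold):
    """A pixel is a color edge if it is an edge in any one of the color records."""
    return np.any([gradient_edges(rgb[..., c], threshold) for c in range(3)], axis=0)

def dilate(mask, radius):
    """Grow a mask by `radius` pixels (city-block distance)."""
    out = mask.copy()
    for _ in range(radius):
        p = np.pad(out, 1, mode="constant", constant_values=False)
        out = (p[1:-1, 1:-1] | p[:-2, 1:-1] | p[2:, 1:-1] |
               p[1:-1, :-2] | p[1:-1, 2:])
    return out

# Assumed test image: a vertical reflectance step between columns 4 and 5.
rgb = np.zeros((10, 10, 3))
rgb[:, :5] = [0.8, 0.3, 0.2]
rgb[:, 5:] = [0.1, 0.2, 0.3]

edges = color_edges(rgb, threshold=0.1)   # only the red record exceeds the threshold
excluded = dilate(edges, radius=2)        # edge pixels plus a 2-pixel margin
keep = ~excluded                          # pixels allowed into the laplacian sums
```

Erring on the side of a wider margin costs some pixels but is cheap compared with letting a single reflectance edge pixel into the estimate, as the edge-alone column shows.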

6  Concluding  Remarks 

Light reflection components usually have different spatial scales. The interface (specular)
reflection component typically has a much finer scale than the body (diffuse) component.
Because of this scale difference, one can effectively separate them by imposing a smoothness
constraint to extract the shape of the diffuse component. In doing so, an estimate of
the illuminant color can be obtained. This illuminant estimator happens to satisfy the
chromaticity collinearity constraint automatically, which makes it particularly attractive
for applications. When it is generalized to a non-uniform surface, it essentially computes
the weighted average of the individual estimates made by each of the uniformly colored
regions, provided that one can eliminate the edge pixels.
The estimator has been tested on computer generated images with reasonably good
performance.  Unlike  the  gray-world  estimator,  it  is  quite  immune  to  heavy  bias  in  the 
reflectance  statistics  of  the  scene.  However,  in  order  to  use  it  for  a  non-uniform  surface, 
explicit  exclusion  of  edges  is  necessary.  Fortunately,  one  need  not  worry  about  false  edges 
because  they  can  be  excluded  from  the  estimator  without  much  loss  of  information.  On  the 
other  hand,  one  has  to  worry  about  undetected  edges  because  their  inclusion  in  computing 
the  estimator  can  introduce  a  large  error  in  the  estimation. 

The estimation accuracy can be greatly improved by weighting each pixel with a function
of its irradiance. Other types of weighting function and other definitions of smoothness can
be used, but are not pursued in this work. One final question one might ask is: Can this
estimator be applied to images with many surfaces? The answer is yes if all the surfaces
are spatially near each other in depth. For surfaces far away from each other, one has to
distinguish the scale difference due to depth from the scale difference between the reflection
components.

Compared with the chromaticity convergence method, the illuminant-color
estimator developed here is simpler in implementation. It works well for a single surface, and
does not explicitly search for straight-line signatures in the chromaticity space. Furthermore,
the computation is more local and therefore potentially can take better care of images
with multiple light sources. However, the chromaticity convergence method does not use
the difference-in-spatial-scale assumption, and would not have any problem when several
surfaces are at different depths. Combination of the two methods to compute a robust
estimation of the scene-illuminant chromaticity is a subject of future research.